perm filename LAMBDA[BOO,JMC]3 blob sn#529005 filedate 1980-08-11 generic text, type C, neo UTF8
.if false then begin "notes"
1. λ-expressions and λ-conversion
2. statement but not proof of Church-Rosser theorem
3. universality of λ-calculus via simulation of LISP using Scott
devices.
4. remarks including
	just because you can do without explicit conditional expressions
or something else doesn't mean it's practical or even theoretically
preferable to do so
	Church's inconsistent logic
	Scott's models
	lcf
	combinators

	This is more general, but a system that realized it
on a computer would most likely do everything in a more complicated
and slower way for the benefit of rarely utilized generality.  We
say "most likely", because someone may invent a way of getting the
generality without paying any operational penalty.

See lambda.lsp[e80,jmc] for program to normalize λ-expressions
and do pure LISP in λ-calculus.
.end "notes"
.ss(lambda,LAMBDA CALCULUS)

	The lambda notation used in LISP was taken from the lambda
calculus developed by Alonzo Church in the late 1920s
and described in his 1941 book, %2The Calculi
of Lambda-Conversion%1.  This section covers some aspects of the
lambda calculus that go beyond what is used in LISP.

	Each LISP expression either evaluates to some object 
(like %2qa_x%1)
or it
represents a function (like %2λx.[qa_x_._y]%1).
[In the terminology of logic, it has a definite
type, but the word "type" is often used differently in computer science
where numbers and list structure are considered of different types.
Logicians use the word "sort" for the computer scientist's "type"].
The lambda calculus
doesn't make this distinction, and any expression can be taken as
a function.  A lambda calculus evaluator would leave many expressions
unevaluated.

	Lambda calculus is much simpler than LISP in its basic structure,
and its well-formed formulas (abbreviated wffs) are constructed as follows:

.item←0
	#. A variable is a wff.

	#. If ⊗f and ⊗e are wffs, then so is ⊗f(e). 

	#. If ⊗v is a variable and ⊗e is a wff,
then ⊗(λv.e) is a wff.
.<<should we introduce a list form of λ-expression here?>>

	#. An expression is a wff only if its being a wff follows
from the above rules.

	Here are some examples.

.item←0
	#. ⊗x and ⊗y are wffs by rule 1.

	#. ⊗x(x), ⊗x(y), ⊗y(x) and ⊗(x(y))(x) are wffs.  The same
symbol can and often does appear in both function and argument
positions.

	#. ⊗(λx.(λy.x)), ⊗(λx.(λy.y)) and ⊗(λf.(λx.f(x(x)))(λx.f(x(x)))) 
are all wffs that will be important in subsequent developments.
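
	These formation rules are easy to experiment with on a machine.
Here is one possible transcription into Python, whose lexically scoped
%2lambda%1 will also serve for later examples; the tuple representation
and the name %2is_wff%1 are merely a convenient choice, not anything
fixed by the λ-calculus.

	# a wff is a string (a variable), ('app', f, e) for f(e),
	# or ('lam', v, e) for (λv.e)
	def is_wff(t):
	    if isinstance(t, str):                               # rule 1: a variable
	        return True
	    if isinstance(t, tuple) and len(t) == 3:
	        if t[0] == 'app':                                # rule 2: f(e)
	            return is_wff(t[1]) and is_wff(t[2])
	        if t[0] == 'lam':                                # rule 3: (λv.e)
	            return isinstance(t[1], str) and is_wff(t[2])
	    return False

	# the three wffs of the last example
	ex1 = ('lam', 'x', ('lam', 'y', 'x'))                    # (λx.(λy.x))
	ex2 = ('lam', 'x', ('lam', 'y', 'y'))                    # (λx.(λy.y))
	half = ('lam', 'x', ('app', 'f', ('app', 'x', 'x')))     # (λx.f(x(x)))
	ex3 = ('lam', 'f', ('app', half, half))                  # (λf.(λx.f(x(x)))(λx.f(x(x))))

	assert all(is_wff(t) for t in (ex1, ex2, ex3))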

	The occurrences of variables in a λ-expression may be divided into
free and bound occurrences.  Any occurrence of ⊗x within an
expression ⊗(λx....) is a bound occurrence.  The expression represented
by ... is called the scope of this occurrence of ⊗λx.  Other occurrences
are called free, and sometimes the variables themselves are called free
or bound.  This is a misnomer, because a given variable may have both
free and bound occurrences in a particular wff.

	Just as in LISP
or in logic or in definite integrals in calculus, bound variables may
be changed without changing the meaning of the expression provided all
occurrences are changed and no variable that is free in a subexpression
becomes bound.  Thus we may change ⊗(λx.(λy.x)) to ⊗(λu.(λy.u)) without
changing the meaning of the expression.  However, we could not change
the ⊗x in ⊗(λx.(λy.x)) to ⊗y without changing the ⊗y to something
else first, because this would give ⊗(λy.(λy.y)), which replaces the
second occurrence of ⊗x by a variable that is now bound by the
second λ.  This kind of change of variables is called α-conversion.

	The second kind of conversion is λ-conversion.  It can be
done when a wff consists of a λ-expression applied to another wff.
An example is ⊗(λx.x(y))(u), and the general form is
⊗(λx.e1)(e2).  The conversion consists essentially in replacing
the given expression by the result of substituting ⊗e2 for
⊗x in ⊗e1.  Thus the above-mentioned ⊗(λx.x(y))(u) converts to
⊗u(y).  There is a catch.  Suppose we want to convert
⊗(λx.(λy.x(y)))(y(y)).  If we simply substitute ⊗y(y) for ⊗x in
⊗(λy.x(y)), the free occurrences of ⊗y in ⊗y(y) will become bound
giving ⊗(λy.y(y)(y)), and this isn't allowed.  When a λ-conversion
will result in the "capture" of a free occurrence of a variable,
it must be preceded by a change of bound variables.  In the above
example, we can replace the bound occurrences of ⊗y before doing
the λ-conversion and get first ⊗(λx.(λu.x(u)))(y(y)) and then
⊗(λu.y(y)(u)).  We find it convenient to regard the preliminary
α-conversion that may be needed as part of the λ-conversion
rather than as a precondition.
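
	The bookkeeping described in the last two paragraphs is entirely
mechanical, and it may help to see it spelled out.  The following Python
sketch substitutes an expression for the free occurrences of a variable,
doing the preliminary α-conversion when a free variable would otherwise
be captured.  The representation of wffs is the one used in the earlier
sketch, and the names %2free_vars%1, %2fresh%1 and %2subst%1 are ours.

	import itertools

	# a wff is a string (a variable), ('app', f, e) for f(e), or ('lam', v, e) for (λv.e)

	def free_vars(t):
	    if isinstance(t, str):                         # a variable
	        return {t}
	    if t[0] == 'app':                              # f(e)
	        return free_vars(t[1]) | free_vars(t[2])
	    _, v, body = t                                 # (λv.body)
	    return free_vars(body) - {v}

	def fresh(avoid):
	    # a bound-variable name not occurring in avoid
	    return next(f'v{i}' for i in itertools.count()
	                if f'v{i}' not in avoid)

	def subst(t, x, s):
	    """Substitute s for the free occurrences of x in t, renaming a
	    bound variable of t when it would capture a free variable of s."""
	    if isinstance(t, str):
	        return s if t == x else t
	    if t[0] == 'app':
	        return ('app', subst(t[1], x, s), subst(t[2], x, s))
	    _, v, body = t
	    if v == x:                                     # x is rebound here; nothing below to replace
	        return t
	    if v in free_vars(s) and x in free_vars(body):
	        w = fresh(free_vars(s) | free_vars(body))  # α-conversion first
	        body, v = subst(body, v, w), w
	    return ('lam', v, subst(body, x, s))

	# the example above: substituting y(y) for x in (λy.x(y)) renames the bound y
	print(subst(('lam', 'y', ('app', 'x', 'y')), 'x', ('app', 'y', 'y')))
	# -> ('lam', 'v0', ('app', ('app', 'y', 'y'), 'v0')), i.e. (λv0.y(y)(v0))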

	We write ⊗e1_→_e2 as a way of saying that ⊗e1 converts
to ⊗e2.  We include conversions of subexpressions as λ-conversions.
Here are some examples of λ-conversion.

	⊗⊗(λx.(λy.x))(a)(b) → (λy.a)(b) → a⊗

	⊗⊗(λx.(λy.y))(a)(b) → (λy.y)(b) → b⊗

	⊗⊗(λx.f(x(x)))(λx.f(x(x))) → f((λx.f(x(x)))(λx.f(x(x))))⊗.

Each of them will be important.

	A wff which is not subject to λ-conversion is said to
be in normal form.  Some λ-expressions reach a normal form after
zero or more successive conversions.  The first
two conversions above are reductions to a normal form.  The third
example is not reduced to normal form, and it is clear that it can't
be, because the argument of ⊗f on the right is just the whole expression
on the left.  Writing ⊗e for that expression, we have

	⊗⊗e → f(e) → f(f(e)) → f(f(f(e))) →⊗ etc.

	A wff may be converted in several ways, because several of
its subexpressions may be applications of a λ-expression to another
expression.  Whether a normal form is reached may depend on the
order in which reductions are done.  However, the Church-Rosser
theorem of 1936 asserts that if a normal form is reached by some
sequence of conversions, it is unique; different conversion paths
cannot give rise to different normal forms.  It can happen, however,
that some paths give a normal form and others don't.

	Church originally proposed the λ-calculus as a way of
formalizing the foundations of mathematics, and the original
version included logical operations and quantifiers.  Unfortunately,
this system proved inconsistent (any formula was provable) and
had to be abandoned.  However, Church was able to show that arithmetic
computations could be encoded as reductions of λ-expressions to
normal form, and it was shown in the 1930s that λ-computations and
Turing machine computations were equally powerful.

	When we compare reduction of λ-expressions to normal form
to LISP computations, it seems surprising that any significant
computing can be done with only λ-conversion.  There are apparently
no lists, no qa, qd or ⊗cons, no conditional expressions and
no recursion.  However, as we shall now show, all these things can
be done in the λ-calculus.  While Church first showed how to do
computation in λ-calculus, the most straightforward and instructive
course for us is to use constructions due to Dana Scott that
imitate pure LISP.

	Before doing this, we need a slightly abbreviated notation.
All our functions have had just one argument, but this argument
could be another function, and this is enough to imitate functions
of several arguments.  The idea is to regard ⊗f(x,y) as just an
abbreviation of ⊗f(x)(y), i.e. to regard a function of two arguments
as a function of one argument (the first argument) whose value 
is a function of one argument (the second argument).  We then
also regard ⊗(λx_y.e) as an abbreviation of ⊗(λx.(λy.e)).
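
	This device, usually called currying, can be illustrated in any
language whose functions may return functions; the following Python
fragment, with names of our own choosing, is such an illustration.

	def add(x, y):                             # a function of two arguments
	    return x + y

	add_curried = lambda x: lambda y: x + y    # the (λx.(λy. ...)) form

	assert add(2, 3) == add_curried(2)(3) == 5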

	We now proceed by identifying certain objects and functions
of LISP as certain λ-expressions.

	We begin with truth values and write

	⊗⊗true = (λx y.x) = (λx.(λy.x))⊗

and 

	⊗⊗false = (λx y.y) = (λx.(λy.y))⊗. 

So far this is quite arbitrary, and the motivation for these
identifications is not apparent.  Now we do conditional
expressions and write

	⊗⊗qif p qthen a qelse b = p(a)(b) = p(a,b)⊗, 

and we already get a return from our investment.  Namely,

	⊗⊗qif true qthen a qelse b = true(a)(b) = (λx y.x)(a,b) → a⊗, 

which is just the property this conditional expression should have.
Likewise

	⊗⊗qif false qthen a qelse b = false(a)(b) = (λx y.y)(a,b) → b⊗, 

which is also wanted.  Using this conditional expression notation,
we can define the propositional connectives as

	⊗⊗p ∧ q = qif p qthen q qelse false = p(q,false)⊗

	⊗⊗p ∨ q = qif p qthen true qelse q = p(true,q)⊗

	⊗⊗¬p = qif p qthen false qelse true = p(false,true)⊗,

and all the usual truth table reductions immediately follow by reducing
the λ-expressions to normal form.  For example

	⊗⊗true ∧ false = true(false,false) = (λx y.x)(false,false) → false⊗.
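
	These identifications can be tried directly in any language whose
functions can take functions as arguments and return them as values.
Here they are as Python lambdas.  The one caution is that Python
evaluates both ⊗a and ⊗b before ⊗p chooses between them, whereas
λ-conversion need never touch the discarded branch.

	TRUE  = lambda x: lambda y: x          # (λx y.x)
	FALSE = lambda x: lambda y: y          # (λx y.y)

	IF  = lambda p, a, b: p(a)(b)          # if p then a else b  =  p(a,b)
	AND = lambda p, q: p(q)(FALSE)         # p ∧ q = p(q,false)
	OR  = lambda p, q: p(TRUE)(q)          # p ∨ q = p(true,q)
	NOT = lambda p: p(FALSE)(TRUE)         # ¬p = p(false,true)

	assert IF(TRUE, 'a', 'b') == 'a'
	assert IF(FALSE, 'a', 'b') == 'b'
	assert AND(TRUE, FALSE) is FALSE       # true ∧ false reduces to false
	assert OR(FALSE, TRUE) is TRUE
	assert NOT(TRUE) is FALSE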

	Next we need ⊗car, ⊗cdr and ⊗cons.  We begin with versions that
meet part of our requirements and will use them to construct the final
versions.
We write

	⊗⊗cons1 = (λx y z.z(x,y))⊗

	⊗⊗car1 = (λx.x(true))⊗

	⊗⊗cdr1 = (λx.x(false))⊗,

which seem mysterious, but it is comforting to observe that

	⊗⊗car1(cons1(e1,e2)) = (λx.x(true))(cons1(e1,e2))⊗

			⊗⊗→ cons1(e1,e2)(true)⊗

			⊗⊗= (λx y z.z(x,y))(e1,e2,true)⊗

			⊗⊗→ true(e1,e2)⊗

			⊗⊗→ e1⊗.

	Likewise

	⊗⊗cdr1(cons1(e1,e2)) →→ e2⊗

where →→ refers to eventual reduction rather than a single step.
Thus ⊗cons1 has one main property of a pairing operation in that
⊗car1 and ⊗cdr1 enable us to get the operands back.  However, we
lack atoms and a way of identifying them.

	Our system will have a single atom ⊗nl, and we define

	⊗⊗nl = cons1(true,true)⊗

	⊗⊗cons = (λx y.cons1(false,cons1(x,y)))⊗

	⊗⊗car = (λx.car1(cdr1(x)))⊗

	⊗⊗cdr = (λx.cdr1(cdr1(x)))⊗

	⊗⊗null = (λx.car1(x))⊗.

The idea is that the atom ⊗nl has its ⊗car1 part ⊗true, while every
expression produced by ⊗cons has its ⊗car1 part ⊗false.  Therefore,
we use the ⊗car1 part in conditional expressions as a test for ⊗null. 
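
	Continuing the transcription, the pairing and list primitives go
over into Python directly, and the little identities verified above can
be checked mechanically.  The asserts compare function objects by
identity, which suffices here because ⊗true, ⊗false and ⊗nl are each
represented by a single object.

	TRUE  = lambda x: lambda y: x
	FALSE = lambda x: lambda y: y

	CONS1 = lambda x: lambda y: lambda z: z(x)(y)    # (λx y z.z(x,y))
	CAR1  = lambda p: p(TRUE)                        # (λx.x(true))
	CDR1  = lambda p: p(FALSE)                       # (λx.x(false))

	NL   = CONS1(TRUE)(TRUE)                         # the single atom nl
	CONS = lambda x: lambda y: CONS1(FALSE)(CONS1(x)(y))
	CAR  = lambda x: CAR1(CDR1(x))
	CDR  = lambda x: CDR1(CDR1(x))
	NULL = lambda x: CAR1(x)

	assert CAR1(CONS1('a')('b')) == 'a' and CDR1(CONS1('a')('b')) == 'b'
	assert NULL(NL) is TRUE                  # the car1 part of nl is true
	assert NULL(CONS(NL)(NL)) is FALSE       # cons always makes it false
	assert CAR(CONS('a')('b')) == 'a' and CDR(CONS('a')('b')) == 'b'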

	Finally, to get recursion, we write

	⊗⊗recur = (λf.(λx.f(x(x)))(λx.f(x(x))))⊗, 

and as mentioned above

	⊗⊗recur(f) → f(recur(f))⊗.

	Now we can do pure LISP.  We can write

	⊗⊗append = recur(λf.(λu v.qif null(u) qthen v qelse cons(car(u),
f(cdr(u),v))))⊗

and

	⊗⊗equal = recur(λf.(λx y.(null(x) ∧ null(y) ∨ (¬null(x) ∧ ¬null(y))
∧ (f(car(x),car(y)) ∧ f(cdr(x),cdr(y))))))⊗

	Using the above abbreviations we can show such facts as

	⊗⊗append(nl,nl) →→ nl⊗

	⊗⊗append(cons(nl,nl),nl) →→ cons(nl,nl)⊗

	⊗⊗equal(nl,nl) →→ true⊗

	⊗⊗equal(cons(nl,nl),nl) →→ false⊗.
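
	The first two of these facts can be checked with the same Python
transcription, with two concessions to Python's call-by-value evaluation:
⊗recur as written above would loop forever, so we use its η-expanded
variant (often called the Z combinator), and the two branches of the
conditional expression are wrapped in parameterless lambdas so that only
the chosen branch is evaluated.  The result is a sketch of the encoding,
not of the normal-order reduction itself.

	TRUE  = lambda x: lambda y: x
	FALSE = lambda x: lambda y: y
	CONS1 = lambda x: lambda y: lambda z: z(x)(y)
	CAR1  = lambda p: p(TRUE)
	CDR1  = lambda p: p(FALSE)
	NL    = CONS1(TRUE)(TRUE)
	CONS  = lambda x: lambda y: CONS1(FALSE)(CONS1(x)(y))
	CAR   = lambda x: CAR1(CDR1(x))
	CDR   = lambda x: CDR1(CDR1(x))
	NULL  = lambda x: CAR1(x)

	# η-expanded recur, usable under call-by-value
	RECUR = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

	# append, with the two branches of the conditional delayed as thunks
	APPEND = RECUR(lambda f: lambda u: lambda v:
	               NULL(u)(lambda: v)(lambda: CONS(CAR(u))(f(CDR(u))(v)))())

	assert APPEND(NL)(NL) is NL                      # append(nl,nl) = nl
	r = APPEND(CONS(NL)(NL))(NL)                     # append(cons(nl,nl),nl)
	assert NULL(r) is FALSE and CAR(r) is NL and CDR(r) is NL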

	There is one catch.  The Church-Rosser theorem assures us that
if we get an answer it will be unique, but it doesn't assure us that
all sequences of reduction get answers, and some don't.  Suppose we are
computing ⊗append(nl,nl).  We have

	⊗⊗append(nl,nl) = recur(λf.(λu v. ...))(nl,nl)⊗

	⊗⊗→→ (λu v.qif null(u) qthen v qelse cons(car(u),e(cdr(u),v)))(nl,nl)⊗,

where ⊗e is an expression that again converts to the function being
applied.

At this point two conversions are possible.  We can convert further
within the function, i.e. expand ⊗e again, but this can go on
indefinitely; or we can apply what we have to the two arguments ⊗nl.
The latter is what leads to a normal form.  The general rule is to do
the outermost possible reduction first.  It can be shown that this
strategy reaches a normal form whenever one exists.
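
	The outermost-first rule is itself easy to mechanize.  The sketch
below repeats %2free_vars%1, %2fresh%1 and %2subst%1 from the earlier
substitution sketch so that it runs on its own, and adds %2step%1, which
performs the leftmost outermost λ-conversion, and %2normalize%1, which
applies %2step%1 until no conversion remains or an arbitrary step limit
is exceeded.

	import itertools

	# terms: a string is a variable, ('app', f, e) is f(e), ('lam', v, e) is (λv.e)

	def free_vars(t):
	    if isinstance(t, str):
	        return {t}
	    if t[0] == 'app':
	        return free_vars(t[1]) | free_vars(t[2])
	    return free_vars(t[2]) - {t[1]}

	def fresh(avoid):
	    return next(f'v{i}' for i in itertools.count() if f'v{i}' not in avoid)

	def subst(t, x, s):
	    # substitute s for free x in t, α-converting to avoid capture
	    if isinstance(t, str):
	        return s if t == x else t
	    if t[0] == 'app':
	        return ('app', subst(t[1], x, s), subst(t[2], x, s))
	    _, v, body = t
	    if v == x:
	        return t
	    if v in free_vars(s) and x in free_vars(body):
	        w = fresh(free_vars(s) | free_vars(body))
	        body, v = subst(body, v, w), w
	    return ('lam', v, subst(body, x, s))

	def step(t):
	    # the leftmost outermost λ-conversion, or None if t is in normal form
	    if isinstance(t, str):
	        return None
	    if t[0] == 'app':
	        f, e = t[1], t[2]
	        if not isinstance(f, str) and f[0] == 'lam':   # (λv.body)(e) is a redex
	            return subst(f[2], f[1], e)
	        r = step(f)
	        if r is not None:
	            return ('app', r, e)
	        r = step(e)
	        return None if r is None else ('app', f, r)
	    r = step(t[2])
	    return None if r is None else ('lam', t[1], r)

	def normalize(t, limit=10000):
	    # do the outermost possible reduction first, up to limit steps
	    for _ in range(limit):
	        r = step(t)
	        if r is None:
	            return t
	        t = r
	    raise RuntimeError(f'no normal form within {limit} steps')

	# (λx.(λy.x))(a)(b) reduces to a, as in the first conversion example
	assert normalize(('app', ('app', ('lam', 'x', ('lam', 'y', 'x')), 'a'), 'b')) == 'a'

	# the third example, (λx.f(x(x)))(λx.f(x(x))), never reaches a normal form
	half = ('lam', 'x', ('app', 'f', ('app', 'x', 'x')))
	try:
	    normalize(('app', half, half), limit=100)
	except RuntimeError as err:
	    print(err)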

	The preceding shows how to translate any pure LISP function into
a wff of the λ-calculus.  If we also encode its arguments, form
the wff consisting of the application of the function to its
arguments, and then normalize the resulting wff, the process will terminate
provided the LISP computation, taken in the call-by-name sense,
terminates, and the resulting normal form will be an encoding
of the result of the LISP computation.

	Here are some remarks.
.item←0

	#. Further definitions could give an improved language for
LISP in λ-calculus.  The way to get more atoms is to
regard a certain class of composite expressions as atoms just as
we did with ⊗nl.

	#. Defining conditional expression as ⊗p(a,b) requires
that the expression used as ⊗p evaluate to ⊗true or ⊗false.  Any
other value can produce nonsense.

	#. The LISP in λ-calculus formalism works only if all
expressions that occur in the computation are constructed in
the manner provided for.  Including general λ-expressions
can cause strange results when an attempt is made to normalize them.

	#. While encoding LISP in λ-calculus shows that λ-calculus
is universal in its computing capabilities, this doesn't seem to
be a good way of taking advantage of the ability of the λ-calculus
to express functions of functions to arbitrarily high orders.  Such
a way hasn't been found.